EFFICIENT ALGORITHMS FOR FINDING MAXIMUM MATCHINGS IN
CONVEX BIPARTITE GRAPHS AND RELATED PROBLEMS

W. Lipski, Jr. and F. P. Preparata


1.  Introduction 

Matching problems constitute a traditionally important topic in
combinatorics and operations research [8] and have been the object of
extensive investigation. Particularly interesting is the problem of finding
a maximum matching in a bipartite graph, which is stated as follows: Let
G = (A,B,E) be an undirected bipartite graph, where A and B are sets of
vertices, and E is a set of edges of the form (a,b) with a ∈ A and b ∈ B.
A subset M ⊆ E is a matching if no two edges in M are incident to the same
vertex; M is of maximum cardinality (or simply, maximum) if it contains the
maximum number of edges. As noted by Hopcroft and Karp [7], this problem has
many applications, such as the chain decomposition of a partially ordered
set, the determination of coset representatives in groups, etc. Hopcroft and
Karp have also developed the best known algorithm for this problem.

A special instance of the problem, with some industrial applications,
was originally discussed by Glover [6] and referred to as matching in a
convex bipartite graph. A bipartite graph G is convex on A if an ordering
of the elements of A can be found so that for any b ∈ B and distinct
a_1 and a_2 in A (with a_1 < a_2)

(a_1,b) ∈ E and (a_2,b) ∈ E  ⇒  (a,b) ∈ E for any a ∈ A such that a_1 < a < a_2.

In other words, G is convex on A when there is an ordering on A such that
for any b ∈ B the set of vertices of A connected to b forms an interval in
this ordering. In such a bipartite graph we let BEG[b] and END[b] denote
the "smallest" and "largest" elements in the interval of the elements
of A connected to b. Naturally, if b ∈ B is isolated, the set A(b) of
vertices of A connected to b is empty and BEG[b] = END[b] = Λ, the empty
symbol. In what follows we assume that there is no isolated vertex in B.

When this property holds, the maximum matching problem is considerably
easier to solve. In fact Glover proved that the following simple procedure
yields a maximum cardinality matching (we assume that both A and B are given
as sequences of integers from 1 to |A| and |B| respectively; MATCH[i] denotes
the element of B matched to i ∈ A):


Algorithm 0

 1  for i := 1 to |A| do
 2  begin U := {k : (i,k) ∈ E and k has not been deleted from B}
 3     if U ≠ ∅ then (* find j ∈ U to be matched to i *)
 4     begin j := element in U with minimum value of END
 5        MATCH[i] := j
 6        Delete j from B
 7     end
 8     else MATCH[i] := Λ (* i unmatched *)
 9  end


In words, element i of A is matched to an available element j of B whose
corresponding interval ends the closest to i. The most time-consuming task
of this algorithm is the formation of the set U and the associated
determination of an element j ∈ U with the smallest value of END[j]: for any
given i ∈ A, it involves scanning all the elements of B connected to i. Thus
the running time of this task is clearly O(|E|), as pointed out by Lawler [8].
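
As an illustration only, Glover's rule can be sketched in Python as follows. The function name glover_naive and the array layout (lists indexed from 1, with index 0 unused) are assumptions made for this sketch and do not come from the paper. For brevity the sketch scans every vertex of B at each step, so it runs in O(mn) time rather than O(|E|).

```python
def glover_naive(m, beg, end):
    """Direct transcription of Glover's rule (Algorithm 0).

    beg, end: lists indexed 1..n (index 0 unused) giving the interval of
    each B-vertex. Returns a list whose (i-1)-th entry is the B-vertex
    matched to A-vertex i, or None if i stays unmatched.
    """
    n = len(beg) - 1
    deleted = [False] * (n + 1)
    match = [None] * (m + 1)
    for i in range(1, m + 1):
        # U: B-vertices adjacent to i that have not been deleted yet
        u = [k for k in range(1, n + 1)
             if not deleted[k] and beg[k] <= i <= end[k]]
        if u:
            j = min(u, key=lambda k: end[k])  # smallest END wins
            match[i] = j
            deleted[j] = True
    return match[1:]
```

With the intervals [1,3], [1,1], [2,2] and m = 3, the rule matches vertex 1 to the second interval, which ends soonest, and so leaves the long interval available for vertex 3.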

In  this  paper  we  shall  describe  a considerably  more  efficient 
implementation  of  Glover's  rule  and  investigate  both  specializations  and 
generalizations  of  the  original  matching  problem.  Specifically  after 
considering  (Section  2)  the  maximum  matching  problem  in  a convex  bipartite 
graph,  we  shall  analyze  the  further  simplifications  which  are  possible 
when  the  graph  is  doubly  convex  (Section  3),  and  the  optimal  time 
determination  of  the  maximum  set  of  independent  vertices  associated  with  a 
given  maximum  matching  (Section  4).  Finally  (Section  5),  we  succinctly 
describe  two  generalizations  of  the  convex  matching  problem  and  an  extension 
of  the  techniques  to  weighted  matching,  which  directly  applies  to  the 
solution  of  a scheduling  problem. 


2.  Maximum matching in convex bipartite graphs: an efficient implementation
of Glover's rule

Let G = (A,B,E) be a bipartite graph convex on A, with |A| = m and
|B| = n. As before, A = {1,2,...,m} and B = {1,2,...,n}. For b ∈ B,
A(b) ⊆ A denotes the set {a : (a,b) ∈ E}; similarly, for a ∈ A, B(a) ⊆ B
denotes the set {b : (a,b) ∈ E}. Again, we assume that A is ordered so that,
for each b ∈ B, A(b) is the interval [BEG[b], END[b]]. Notice that if the
set A is not initially ordered so that the property of convexity is manifest,
the bipartite graph G can be tested for possession of this property - and,
if so, rearranged - in time O(|E|+m+n) by means of the Booth-Lueker algorithm
[2].

We begin by giving a generalization (and simpler proof) of Glover's rule.
Lemma 1. If (a,b) ∈ E and A(b) ⊆ A(c) for every c ∈ B(a), then there is a
maximum matching containing (a,b).

Proof. Suppose M is a maximum matching not containing (a,b). If a is
unmatched then we may replace the edge of the matching incident to b with
(a,b); similarly if b is unmatched. Suppose therefore that (a,c), (d,b) ∈ M
for some c ∈ B, d ∈ A. Since d ∈ A(b) ⊆ A(c), it follows that (d,c) ∈ E,
and we may replace (a,c),(d,b) by (a,b),(d,c) (see Figure 1).


Figure  1.  To  the  proof  of  Lemma  1.  Wiggly  edges  belong  to  the  matching. 
In order to prove that Algorithm 0 correctly finds a maximum matching,
let us denote by G_i the graph obtained from G by deleting 1,...,i-1 from A
and MATCH[1],...,MATCH[i-1] from B, together with the edges incident to all
these vertices. Let M_i be the set of edges matched by Algorithm 0 to
vertices 1,...,i in A (we put M_0 = ∅), and let A_i(b) and B_i(a) be defined
for G_i in the same way as A(b) and B(a) were defined for G. We say that M_i
can be extended to a maximum matching of G if there is a maximum matching
M of G containing M_i; this means that M is the union of M_i and of a maximum
matching of G_{i+1}.

Assume inductively that a ≤ m and that M_{a-1} can be extended to a
maximum matching of G. (This is trivially true for a = 1, since M_0 is empty
and G_1 coincides with G.) We shall prove that M_a can also be extended to
a maximum matching of G. This is obviously true if B_a(a) = ∅, so assume
that B_a(a) ≠ ∅, whence Algorithm 0 chooses MATCH[a] = b ≠ Λ. It is then
sufficient to show that there is a maximum matching of G_a containing
(a,b). But this is immediate, since for any c in B_a(a) we have
A_a(c) = [a,END[c]]; by line 4 of Algorithm 0, we have END[b] ≤ END[c] for
any c ≠ b in B_a(a), whence A_a(b) ⊆ A_a(c), and, by Lemma 1, the claim is
established.

As noted earlier, efficiency can be achieved if for a given a ∈ A
the computation of j ∈ B_a(a) for which END[j] is minimum can be sped up.
We shall now show that, by some additional preprocessing and the use of
appropriate data structures, this can be done in time which is sublogarithmic
in the size of B.

The basic idea is to try to store the set B_i(i) of unmatched vertices
of B connected to a currently inspected vertex i ∈ A on a priority queue,
so that the element j ∈ B to be matched to i can be found as the least element
of the queue. This is indeed possible if the elements of B are relabelled
so that END[1] ≤ ... ≤ END[n]. Then the least element of the priority
queue minimizes the value of END, as required by Glover's rule. In order
to complete the description of our implementation, we should specify a
method of updating the priority queue, so that its content is changed from
B_i(i) to B_{i+1}(i+1) as i is increased by one. It is easy to see that we
should delete the least element from the queue (the vertex to be matched
to i), then delete all vertices k ∈ B with END[k] = i and finally insert all
vertices k ∈ B with BEG[k] = i+1. Deleting vertices is easy, since the set
of vertices k ∈ B with END[k] = i appears as an interval in our ordering of
B. Inserting vertices can be made easy too, if we precompute an array
ORDBEG[1:n] containing the vertices of B sorted according to the
parameter BEG, so that BEG[ORDBEG[1]] ≤ ... ≤ BEG[ORDBEG[n]]; then the set
of vertices k ∈ B with BEG[k] = i is stored in an interval of consecutive
positions of ORDBEG. Notice that both relabelling of vertices in B so
that END[1] ≤ ... ≤ END[n] and computing the array ORDBEG can be done in
time O(m+n) by standard bucket sorting (see e.g. [1]), since in both cases
there are n items to be sorted by a key which may assume values from the
integers 1,...,m.
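
The preprocessing just described might look as follows in Python. This is a sketch under assumptions: the helper names bucket_order and preprocess, and the extra array old_label that maps new labels back to the original ones, are introduced here for illustration only. Arrays are indexed from 1, with index 0 unused.

```python
def bucket_order(keys, m):
    """Stable bucket sort: labels 1..n ordered by keys[k], keys in 1..m."""
    n = len(keys) - 1
    buckets = [[] for _ in range(m + 1)]
    for k in range(1, n + 1):
        buckets[keys[k]].append(k)
    return [k for bucket in buckets for k in bucket]


def preprocess(m, beg, end):
    """Relabel B by nondecreasing END and build ORDBEG, in O(m+n) time.

    Returns (beg2, end2, ordbeg, old_label), all indexed from 1, where
    old_label[k] is the original label of the vertex now called k.
    """
    by_end = bucket_order(end, m)
    beg2 = [0] + [beg[k] for k in by_end]
    end2 = [0] + [end[k] for k in by_end]
    ordbeg = [0] + bucket_order(beg2, m)
    return beg2, end2, ordbeg, [0] + by_end
```

Each call to bucket_order touches m+1 buckets and n items once, which is where the O(m+n) bound comes from.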

Next we may take advantage of the fact that the elements in the
priority queue are integers in the range [1,n] and employ the priority
queue structure developed by van Emde Boas [3,4], which allows each of the
standard queue operations to be performed in time O(log log n) and uses
space O(n).

We can now formally describe the matching algorithm, where:

QUEUE denotes the just mentioned priority queue à la van Emde Boas (with
associated operations MIN, DELETE, INSERT, EXTRACTMIN); MATCH[1:m],
ORDBEG[1:n], BEG[1:n], and END[1:n] are arrays of integers; the integer
variables nb and ne are counters referring to the arrays ORDBEG and END,
respectively (nb-1 and ne-1 count respectively the number of beginnings
and ends of intervals [BEG[k],END[k]] found so far).

Algorithm 1 (Finding maximum matching in convex bipartite graph)

Input:   BEG[1:n], END[1:n], ORDBEG[1:n]
         END[1] ≤ ... ≤ END[n], BEG[ORDBEG[1]] ≤ ... ≤ BEG[ORDBEG[n]]
Output:  MATCH[1:m]

 1  begin QUEUE := ∅, nb := ne := 1
 2     for i := 1 to m do
 3     begin (* find vertex to be matched to i *)
 4        while (nb ≤ n) and (BEG[ORDBEG[nb]] = i) do
 5        begin INSERT(ORDBEG[nb])
 6           nb := nb + 1
 7        end
 8        if QUEUE = ∅ then MATCH[i] := Λ (* i unmatched *)
 9        else begin MATCH[i] := MIN
10           EXTRACTMIN
11        end
12        while (ne ≤ n) and (END[ne] = i) do
13        begin DELETE(ne)
14           ne := ne + 1
15        end
16     end
17  end

From the viewpoint of performance, notice that each term of MATCH[1:m]
is processed exactly once (lines 8 or 9), for a total work O(m), while each
term of B is inserted into the queue once (line 5) and extracted once
(lines 10 or 13). So we conclude that the running time of Algorithm 1 is
O(m + n log log n).
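
The following Python sketch mirrors Algorithm 1 but, as a deliberate simplification, swaps the van Emde Boas queue for the binary heap of the standard heapq module, so it runs in O(m + n log n) rather than O(m + n log log n). Because heapq has no arbitrary DELETE, expired vertices are discarded lazily when they surface at the top of the heap; this works because labels are ordered by END. The function name match_convex and the 1-based array layout are assumptions of the sketch.

```python
import heapq


def match_convex(m, beg, end, ordbeg):
    """Glover's rule with a priority queue (binary heap stand-in).

    Assumes end[1] <= ... <= end[n] and that ordbeg[1..n] lists the labels
    by nondecreasing beg; index 0 of every array is unused.
    Returns a list whose (i-1)-th entry is MATCH[i] (None if unmatched).
    """
    n = len(beg) - 1
    match = [None] * (m + 1)
    heap = []
    nb = 1
    for i in range(1, m + 1):
        # insert every interval that begins at i
        while nb <= n and beg[ordbeg[nb]] == i:
            heapq.heappush(heap, ordbeg[nb])
            nb += 1
        # lazy deletion: drop intervals that ended before i
        while heap and end[heap[0]] < i:
            heapq.heappop(heap)
        if heap:
            # least label = least END among the available vertices
            match[i] = heapq.heappop(heap)
    return match[1:]
```

Every label is pushed once and popped at most once, so the heap operations total O(n log n).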


3.  Maximum matching in doubly convex bipartite graphs

As  noted  by  Glover,  the  maximum  matching  problem  becomes  even  simpler 
when  the  bipartite  graph  G is  doubly  convex,  i.e.,  orderings  of  both  A and  B 
exist  such  that  every  A(b)  is  an  interval  of  A and  every  B(a)  is  an  interval 
of  B. 

As before, we assume that G is given as a bipartite graph convex on A,
that is, as a set {<BEG[b],END[b]> : b ∈ B} representing intervals of A.
A preliminary task is to test whether the set B can be reordered so that
for each a ∈ A the set B(a) is an interval of B.

Pictorially, we may display G by means of a set of segments (Figure 2a):
specifically, in the plane (x,y), we let the segment y = b, BEG[b] ≤ x ≤ END[b]
represent the interval A(b) (in the sequel this will be briefly referred
to as segment b). If we next join the extremes of adjacent segments, i.e.,
introduce in this diagram edges (BEG[i],BEG[i+1]) and (END[i],END[i+1]),
for i = 1,2,...,n-1, the set of segments is enveloped by two polygonal
lines called the left and right boundaries, which together with the first
and last segments of the given set form a simple polygon. In this
representation, G is convex on B if the intercept of a vertical line with
this polygon consists of a single segment: thus G is convex on B if and
only if the segments can be rearranged so that both boundaries are bitonic,
as shown in Figure 2d (that is, in the resulting relabelling of elements
of B, for some 1 ≤ r_1 ≤ n, BEG[1] ≥ ... ≥ BEG[r_1] and
BEG[r_1] ≤ ... ≤ BEG[n]; similarly for some 1 ≤ r_2 ≤ n,
END[1] ≤ ... ≤ END[r_2] and END[r_2] ≥ ... ≥ END[n]). We shall now describe
a linear time - hence optimal - algorithm which tests G for double convexity
and, if this property holds, produces the desired ordering of B.

Figure 2. Different polygons corresponding to the same set of segments.
(a) arbitrary order; (b),(c) ordered by nonincreasing BEG;
(d) ordered to exhibit double convexity.


In the rest of this section we shall always assume that the convex
bipartite graph G under consideration is connected. In fact, it is very
easy to find connected components of a convex bipartite graph. It is
sufficient to scan vertices i ∈ A in increasing order and to count the
number of beginnings and the number of endings of intervals found up to
vertex i. Each time these two counts coincide, a new connected component
is found. With the elements of B labelled so that END[1] ≤ ... ≤ END[n],
and with the array ORDBEG as in Algorithm 1, the determination of connected
components can be done in O(m+n) time.
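
A possible rendering of this counting scheme in Python is given below. It is a sketch only: the function name convex_components is hypothetical, and instead of the arrays END and ORDBEG it tallies beginnings and endings per position in two counting arrays, which also takes O(m+n) time. It returns, for each vertex of A, the index of its component.

```python
def convex_components(m, beg, end):
    """Label each A-vertex 1..m with the index of its connected component.

    beg, end: lists indexed 1..n (index 0 unused). A component is closed
    after vertex i whenever the numbers of intervals begun and ended up
    to i coincide, i.e. no interval spans the gap between i and i+1.
    """
    n = len(beg) - 1
    begins = [0] * (m + 1)
    ends = [0] * (m + 1)
    for b in range(1, n + 1):
        begins[beg[b]] += 1
        ends[end[b]] += 1
    labels = []
    component = started = finished = 0
    for i in range(1, m + 1):
        started += begins[i]
        finished += ends[i]
        labels.append(component)
        if started == finished:
            component += 1  # vertex i+1 opens a new component
    return labels
```

A vertex of A covered by no interval receives a component of its own, as in the second check of the accompanying test.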

Referring to Figure 2d, it is easy to see that the polygon displaying
the double convexity of an arbitrary G consists - up to the reversal of the
ordering of B - of three regions (not all simultaneously empty): a middle
region, where both left and right boundaries are nondecreasing (i.e.,
both BEG[j] and END[j] are nondecreasing with increasing j, assuming that
the labelling of elements of B coincides with the bottom to top ordering of
segments in the given geometric representation); a top region where the left
and right boundaries are nondecreasing and nonincreasing, respectively; a
bottom region where the left and right boundaries are nonincreasing and
nondecreasing, respectively. Moreover, all segments of the top region are
nested, starting with the topmost segment of the middle region; similarly,
all segments of the bottom region are nested, starting with the bottommost
segment of the middle region.

It is easy to see that our description need not define the three regions
uniquely, if there are different elements in B with the same value of BEG
or END; to guarantee the uniqueness we require that all segments in the
bottom region have BEG[j] > min_{1≤k≤n} BEG[k], and all segments in the
top region have END[j] < max_{1≤k≤n} END[k].

Suppose that we initially index the elements of B so that the pairs
<BEG[j],END[j]>, j = 1,...,n are in lexicographic ascending order; this can
be done by bucket sorting these elements on the parameter END, and then
(stably) bucket sorting the resulting sequence on the parameter BEG, all in
time O(m+n). Once this ordering of segments {A(b) : b ∈ B} is available
(see Figure 2b), we shall first extract from it the subsequence of segments
to be assigned to the middle region. To complete the test, we must verify
whether the remaining segments can be successfully assigned to either the top
or the bottom region. Since for segments in these regions the orderings by
BEG and END are contragradient, we must preliminarily alter the order of the
segments not assigned to the middle region, so that for any two such
consecutive segments j and j+1, (BEG[j] = BEG[j+1]) ⇒ (END[j] ≥ END[j+1]):
this can obviously be done in linear time by a straightforward use of a
stack (Figure 2c). Next, we must test whether the resulting sequence can be
partitioned into two subsequences, for each of which the parameter END is
nonincreasing: if this is feasible, then the two subsequences of segments
will respectively form the top and bottom regions. More exactly, we should
do the partitioning in such a way that the resulting subsequences of
segments are nested as previously explained. We guarantee this by assigning
the extremal segments of the middle region to the sequence to be partitioned.

The  whole  task  is  performed  by  the  following  algorithm,  which  computes 
for  each  segment  j a parameter  Y[j]  denoting  its  order  in  the  final 
arrangement.  This  algorithm  also  makes  use  of  a special  subroutine,  which  - 
if  at  all  possible  - partitions  in  linear  time  a sequence  of  integers  into 
two nonincreasing subsequences; for example, (4,6,3,5,4) is partitioned
into  (4,3)  and  (6,5,4).  This  simple  subroutine  is  described  formally  in  an 
appendix.  Its  additional  feature,  which  is  important  for  the  correctness 
of  our  algorithm,  is  that  the  first  term  of  the  sequence  is  assigned  to 
the  first  subsequence. 
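
Since the appendix is not reproduced here, the following Python sketch shows one plausible way to realize such a subroutine; it is a greedy best-fit scheme and should not be read as the authors' own routine. Each term is appended to the subsequence whose last element is the smallest one still not below it, with ties going to the first subsequence, so the first term always lands in the first subsequence. The name split_nonincreasing is an assumption.

```python
INF = float("inf")


def split_nonincreasing(seq):
    """Split seq into two nonincreasing subsequences, or return None.

    Greedy best fit: an empty subsequence behaves as if it ended in
    +infinity, and a term goes to the subsequence with the smaller last
    element among those that can still accept it (ties: the first one).
    Runs in time linear in len(seq).
    """
    first, second = [], []
    for x in seq:
        top1 = first[-1] if first else INF
        top2 = second[-1] if second else INF
        if top1 >= x and (top2 < x or top1 <= top2):
            first.append(x)
        elif top2 >= x:
            second.append(x)
        else:
            return None  # x exceeds both tails: no valid partition
    return first, second
```

On the sequence (4,6,3,5,4) the sketch reproduces the partition quoted above, and it reports failure on a strictly increasing sequence of three terms such as (1,2,3).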



Algorithm 2 (Testing for double convexity of a connected convex
bipartite graph)

Input:   BEG[1:n], END[1:n]
         The pairs <BEG[j],END[j]>, j = 1,...,n are in lexicographic
         increasing ordering
Output:  Y[1:n]
         Vertices j ∈ B relabelled so that for 1 ≤ j < n
         BEG[j] < BEG[j+1], or BEG[j] = BEG[j+1] and END[j] ≥ END[j+1]

begin (* find last segment jm of middle region *)
   jm := 1
   for j := 2 to n do
      if END[j] ≥ END[jm] then jm := j
   (* extract segments not in internal part of middle region *)
   e := END[1], ℓ := 0
   for j := 1 to n do
      if (END[j] ≥ e) and (j ≠ 1) and (j ≠ jm) then e := END[j]
      else begin ℓ := ℓ + 1
         S[ℓ] := j
      end
   relabel the elements of B so that for 1 ≤ j < n
      (BEG[j] = BEG[j+1]) ⇒ (END[j] ≥ END[j+1])
   reorder S[1:ℓ] so that for 1 ≤ p < ℓ
      (BEG[S[p]] = BEG[S[p+1]]) ⇒ (END[S[p]] ≥ END[S[p+1]])
   partition S[1:ℓ] into two subsequences SUB1[1:ℓ1] and
      SUB2[1:ℓ2], such that END[SUB1[1]] ≥ ... ≥ END[SUB1[ℓ1]]
      and END[SUB2[1]] ≥ ... ≥ END[SUB2[ℓ2]]
   k1 := k2 := k3 := 1
   for j := 1 to n do (* determine Y[j] *)
      if SUB1[k1] = j then (* j belongs to bottom region *)
      begin Y[j] := ℓ1 - k1 + 1
         k1 := k1 + 1
      end
      else if SUB2[k2] = j then (* j belongs to top region *)
      begin Y[j] := n - ℓ2 + k2
         k2 := k2 + 1
      end
      else (* j belongs to middle region *)
      begin Y[j] := ℓ1 + k3
         k3 := k3 + 1
      end
end

It is straightforward to conclude that Algorithm 2 runs in time O(n).

We can now describe the maximum matching algorithm, which makes use
of a DEQUE (double-ended queue) as an auxiliary data structure; as is
well known, DEQUE has two distinguished elements, top and bottom, and the
following repertoire of instructions: INSERTTOP, DELETETOP, INSERTBOTTOM,
and DELETEBOTTOM.

Algorithm 3 (Finding maximum matching in doubly convex bipartite graph)

Input:   BEG[1:n], END[1:n], Y[1:n]
         BEG[j] < BEG[j+1], or BEG[j] = BEG[j+1] and END[j] ≥ END[j+1]
         for 1 ≤ j < n
Output:  MATCH[1:m]

begin DEQUE := ∅, j := 1
   for i := 1 to m do
   begin (* find element in B to be matched to i ∈ A *)
      while (j ≤ n) and (BEG[j] = i) do
      begin (* insert j into deque *)
         if (DEQUE = ∅) or (Y[j] > Y[top]) then INSERTTOP(j)
         else INSERTBOTTOM(j)
         j := j + 1
      end
      if (DEQUE = ∅) then MATCH[i] := Λ (* i unmatched *)
      else if END[top] < END[bottom] then
      begin MATCH[i] := top
         DELETETOP
      end
      else begin MATCH[i] := bottom
         DELETEBOTTOM
      end
      while (DEQUE ≠ ∅) and (END[top] = i) do DELETETOP
      while (DEQUE ≠ ∅) and (END[bottom] = i) do DELETEBOTTOM
   end
end

Notice that each element of B is inserted into and deleted from the
DEQUE exactly once, and that each of the standard deque operations can be
executed in constant time; it follows that the entire matching can be
computed in time O(m+n).
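
For concreteness, Algorithm 3 can be transcribed into Python with the standard collections.deque, whose right end plays the role of top and whose left end plays the role of bottom. The function name match_doubly_convex and the 1-based array layout (index 0 unused) are assumptions of this sketch; the array y is taken as given, for instance as produced by Algorithm 2.

```python
from collections import deque


def match_doubly_convex(m, beg, end, y):
    """Deque-based matching for a doubly convex bipartite graph.

    beg, end, y: lists indexed 1..n (index 0 unused), with the segments
    sorted by nondecreasing beg and y[j] the position of segment j in an
    ordering of B exhibiting double convexity.
    Returns a list whose (i-1)-th entry is MATCH[i] (None if unmatched).
    """
    n = len(beg) - 1
    match = [None] * (m + 1)
    dq = deque()  # dq[0] is the bottom, dq[-1] is the top
    j = 1
    for i in range(1, m + 1):
        while j <= n and beg[j] == i:
            if not dq or y[j] > y[dq[-1]]:
                dq.append(j)       # INSERTTOP
            else:
                dq.appendleft(j)   # INSERTBOTTOM
            j += 1
        if dq:
            # the minimum of END sits at one of the two ends of the deque
            if end[dq[-1]] < end[dq[0]]:
                match[i] = dq.pop()
            else:
                match[i] = dq.popleft()
        while dq and end[dq[-1]] == i:
            dq.pop()
        while dq and end[dq[0]] == i:
            dq.popleft()
    return match[1:]
```

In the second check of the accompanying test, the nested pair of segments [1,4] and [1,1] shows the choice between the two ends of the deque: vertex 1 takes the short segment from the top, leaving the long one for vertex 2.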


4.  Finding a maximum independent set of vertices in a convex bipartite
graph

to finding all the vertices of G which are reachable from B_0. A most
interesting fact we shall now show is that, when G is convex, this reachable
set can be obtained in time O(n+m), so that the determination of a maximum
independent set runs in total time O(m + n log log n), or O(m+n) if G is
doubly convex, the computation of the maximum matching being the dominant
task (notice that, once and are known, I is obtainable in time
O(n+m)).

As usual, the graph G is described by the two arrays BEG[1:n] and
END[1:n]; MATCH[1:m] gives for each i ∈ A either Λ or the element of B
matched to it. We assume that the elements of B are ordered so that
BEG[i] ≤ BEG[i+1], 1 ≤ i < n. Due to the property of convexity, for each
b ∈ B_0 the set A(b) of vertices reachable by a single edge from it forms an
interval of A; from any matched vertex a in this interval we reach a single
vertex MATCH[a] ∈ B, which in turn reaches another interval A(MATCH[a]) of
A. Notice that A(b) and A(MATCH[a]) necessarily overlap, so by the
convexity of G their union is a single interval. Therefore, initially we
place in a queue all the elements of B_0 in increasing order, and starting with
the smallest one j_1, we determine a single extended interval A*(j_1) ⊇ A(j_1)
of A, which is the set of all elements of A which are reachable from j_1
(A*(j_1) could be informally viewed as the "closure" of A(j_1)). This
extended interval is constructed by scanning A(j_1) in decreasing order
starting from END[j_1] and continually updating the extremes of the reached
interval; once the scanning reaches the lower extreme without further
downward extension of the interval, then if the interval has been extended
upward beyond END[j_1], scanning is resumed in ascending order starting from
END[j_1] until the same terminating condition occurs, and this process
is repeated until no further extension - either downward or upward - is
possible. At this point the construction of interval A*(j_1) has been
completed. We then extract the next element j_2 from the queue and begin
the construction of A*(j_2). Notice that if A*(j_1) and A(j_2) are disjoint
(Figure 4a), BEG[j_2] must be larger than the upper extreme of A*(j_1), since
by hypothesis BEG[j_1] ≤ BEG[j_2]; it follows that only downward extensions
of A(j_2) may meet previously scanned elements of A. To avoid any
time-consuming unnecessary repeated scanning, we must ensure that any
previously scanned interval is skipped in subsequent processing, so that each
element of A is scanned at most once. This objective is achieved by means of
a stack: as soon as the construction of A*(j), for some j ∈ B_0, is completed,
its lower and upper extremes are inserted into the stack, whose content -
at a generic instant - is a sequence -1,i_1,e_1,i_2,e_2,...,i_k,e_k such that,
for 1 ≤ p < k, e_p + 1 < i_{p+1}, each [i_p,e_p] is an interval of A, and
the union of [i_p,e_p] over p = 1,...,k is the set of all scanned elements
of A. The reachability algorithm uses as auxiliary data structures a QUEUE,
containing the elements of B_0 ordered according to nondecreasing value of
BEG, and a STACK, for storing the sequence of scanned intervals, as already
noted. The intuitive significance of the program variables lower, upper, ℓ,
and u is as follows (see Figure 4b): lower and upper denote respectively the
current boundaries of the extended interval being constructed; ℓ and u are
pointers used in scanning, running downward and upward respectively.

Figure 4. (a) Illustration of the case where A*(j_1) and A(j_2) are disjoint.
(b) Explanation of the meaning of variables "lower", "upper", ℓ and u.



Algorithm 4 (Finding the set of vertices in A reachable by
alternating paths from the set of unmatched vertices in B in a convex
bipartite graph)

Input:   BEG[1:n], END[1:n], MATCH[1:m]
         QUEUE containing the unmatched vertices b ∈ B in increasing order
         BEG[1] ≤ ... ≤ BEG[n]
Output:  The set [i_1,e_1] ∪ ... ∪ [i_k,e_k] ⊆ A of vertices reachable from
         unmatched vertices b ∈ B, represented by a sequence
         -1,i_1,e_1,i_2,e_2,...,i_k,e_k stored on STACK

 1  begin
 2     STACK ⇐ -1
 3     while QUEUE ≠ ∅ do (* find vertices reachable from first(QUEUE) *)
 4     begin j ⇐ QUEUE
 5        if END[j] > top(STACK) then (* new vertices to be scanned *)
 6        begin ℓ := END[j]+1, lower := BEG[j], u := upper := END[j]
 7           repeat (* extend interval of vertices reached from j *)
 8              while ℓ > lower do (* scan downward *)
 9              begin ℓ := ℓ-1
10                 if MATCH[ℓ] ≠ Λ then (* ℓ is matched *)
11                 begin lower := min(lower, BEG[MATCH[ℓ]])
12                    upper := max(upper, END[MATCH[ℓ]])
13                 end
14                 if ℓ ≤ top(STACK)+1 then (* skip interval *)
15                 begin ℓ ⇐ STACK
16                    ℓ ⇐ STACK
17                    lower := min(lower, ℓ)
18                 end
19              end
20              while u < upper do (* scan upward *)
21              begin u := u+1
22                 if MATCH[u] ≠ Λ then (* u is matched *)
23                 begin lower := min(lower, BEG[MATCH[u]])
24                    upper := max(upper, END[MATCH[u]])
25                 end
26              end
27           until (ℓ = lower) and (u = upper) (* extended interval completed *)
28           STACK ⇐ lower
29           STACK ⇐ upper
30        end
31     end
32  end


To analyze the performance of Algorithm 4, we note that each element
of A is scanned at most once (either by loop 8 or by loop 20); the extremes
of extended intervals are pushed onto (lines 28 and 29) and popped from
STACK (lines 15 and 16) at most once, thereby allowing the conclusion that
the algorithm runs in time O(m+n).
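
A Python sketch of Algorithm 4 follows. It is offered as an illustration under assumptions rather than as a reference implementation: the function name reachable_intervals is hypothetical, the queue of unmatched vertices is derived from the match array instead of being passed in, and the skip test on line 14 is read as "ℓ ≤ top(STACK)+1". Arrays are indexed from 1, with index 0 unused, and None stands for Λ.

```python
def reachable_intervals(m, beg, end, match):
    """Intervals of A reachable by alternating paths from unmatched B.

    beg, end: lists indexed 1..n (index 0 unused), with beg nondecreasing.
    match: list indexed 1..m (index 0 unused); match[i] is a B-label or None.
    Returns the disjoint intervals (i_p, e_p) in increasing order.
    """
    n = len(beg) - 1
    matched = {b for b in match[1:] if b is not None}
    queue = [j for j in range(1, n + 1) if j not in matched]
    stack = [-1]  # sentinel, then i_1, e_1, i_2, e_2, ...
    for j in queue:
        if end[j] <= stack[-1]:
            continue  # nothing new to scan for this vertex
        low = end[j] + 1  # downward scanning pointer
        lower, up, upper = beg[j], end[j], end[j]
        while True:
            while low > lower:  # scan downward
                low -= 1
                if match[low] is not None:
                    lower = min(lower, beg[match[low]])
                    upper = max(upper, end[match[low]])
                if low <= stack[-1] + 1:  # next to a scanned interval
                    stack.pop()           # discard its upper extreme
                    low = stack.pop()     # jump to its lower extreme
                    lower = min(lower, low)
            while up < upper:  # scan upward
                up += 1
                if match[up] is not None:
                    lower = min(lower, beg[match[up]])
                    upper = max(upper, end[match[up]])
            if low == lower and up == upper:
                break  # extended interval completed
        stack.extend((lower, upper))
    return list(zip(stack[1::2], stack[2::2]))
```

The third check in the accompanying test exercises the skip step: the interval (1,2) found for the first unmatched vertex is popped and merged into (1,3) when the scan started from a later unmatched vertex reaches it.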


5.  Generalizations and related problems


In this section we shall briefly describe two interesting generalizations
of the notion of a convex bipartite graph to which Glover's rule, and hence
the efficient algorithms previously described, are applicable, and an
extension of the techniques to a weighted matching problem, which models
a significant scheduling application.

5.1.  Simple chessboards: a generalization of doubly convex bipartite graphs
Algorithm 3 can be applied to a class of convex bipartite graphs more
general than that of doubly convex graphs. In order to describe this class
we shall need some definitions. By a chessboard we shall mean any finite
collection of unit squares with integer coordinates on a plane. Any such
unit square will be denoted by coordinates <x,y> of its left lower corner.
A chessboard is simple if for any two of its squares <x,y_1>, <x,y_2>, where
y_1 < y_2, it contains all squares <x,y>, y_1 < y < y_2 (see Figure 5). Rows
and columns of a chessboard are defined in the natural way as maximal
horizontal and vertical sequences of adjacent squares, respectively. We may
allow a simple chessboard to be cut vertically in some places to make
some squares nonadjacent (such as <6,8> and <7,8> in Figure 5), provided the
line along which we cut touches the boundary of the chessboard. Let A and B
be the set of columns and rows of a simple chessboard, respectively, and
let us consider the bipartite graph G = (A,B,E), where (a,b) ∈ E iff column
a and row b intersect (i.e., have a square in common). This graph is
convex on A (but not necessarily doubly convex), the required ordering of
A being given by the natural left-to-right ordering of columns. It is
easily seen that any matching in G corresponds to a set of nonattacking
rooks on this chessboard (see Figure 5). If row j of a simple chessboard
consists of squares <x,Y[j]>, BEG[j] ≤ x ≤ END[j], then the maximum
cardinality set of nonattacking rooks on this chessboard is found by
Algorithm 3 in time linear in the number of rows and columns. The reason
why Algorithm 3 works correctly is that, similarly to the doubly convex
case, the sequence of ends of rows "seen" from any column of a simple
chessboard is bitonic, whence the sequence of the values of END for
vertices j ∈ B (rows of the chessboard) stored on the DEQUE is also
bitonic, and we may find a vertex with the minimal value of END either at
the top or at the bottom of the DEQUE. We leave details to the reader.


Figure  5.  A simple  chessboard  with  a maximum  set  of  nonattacking 
rooks  found  by  Algorithm  3. 


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5.2.  Bipartite  graphs  convex  on  a tree-ordered  set 
Glover's  rule  works  correctly  in  a more  general  situation,  where  the 
sets A(b), b ∈ B, are (sets of vertices of) paths in a directed tree (for
concreteness  we  shall  assume  that  the  tree  is  directed  toward  the  root; 
families  of  sets  of  this  type  are  of  some  importance  in  file  organization, 
see  [9]).  The  convex  case  is  easily  seen  to  correspond  to  a tree 
degenerating  into  a single  path.  Assume  that  a directed  tree  with  vertex 
set A is represented by an array S[1:m] which gives the successor S[a] of
any vertex a ∈ A (S[a] = Λ if a is the root). As in the convex
case, let A(b) be represented by the pair <BEG[b],END[b]>, meaning that A(b)
is the set of vertices of the path in the tree beginning at BEG[b] and
ending at END[b]. From the array S we can easily produce, in O(m) time,
a topological  ordering  of  A,  i.e.,  a linear  ordering  of  the  elements  of  A, 
in  which  the  distance  to  the  root,  or  the  rank  of  a vertex,  is  nonincreasing. 
We  may  also  assume  that  the  predecessors  of  any  vertex  appear  consecutively 
in this ordering, and that if a_1 appears earlier than a_2 then all predecessors
of a_1 appear earlier than all predecessors of a_2. This is always the case if
the  ordering  is  found  by  a breadth-first  search  of  the  tree.  The  algorithm 
for  finding  a maximum  matching  in  our  bipartite  graph  processes  the 
vertices  of  A according  to  the  just  described  ordering  and  runs  as  follows. 
Instead  of  a single  priority  queue,  we  maintain  a collection  of  priority 
queues;  at  any  instant  in  the  execution  of  the  algorithm  there  are  as  many 
distinct  queues  as  there  are  vertices  of  A with  the  same  value  of  rank 
currently being processed. Each time we encounter a vertex i ∈ A which is
a leaf of the tree we initialize a new priority queue and insert into it all
vertices j ∈ B with BEG[j] = i; each time we have processed all predecessors
of  a vertex  a,  we  merge  the  queues  corresponding  to  them  into  one  queue 
corresponding  to  a.  All  other  details  are  the  same  as  in  Algorithm  1.  The 
reason  why  our  procedure  works  correctly  is  as  follows.  The  priority  queue 
Q corresponding to a vertex a contains all so far unmatched vertices b ∈ B
such that a ∈ A(b). The paths starting at a and ending at vertices END[b],
b in Q, are nested one in another, exactly as in the convex case, whence the
same argument based on Lemma 1 can be applied to prove that matching a to
the vertex b in Q with the minimal value of END (i.e., with END[b] nearest
to a) guarantees that the matching obtained will be of maximum cardinality.
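
The procedure just described can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the implementation whose running time is analyzed in the text: ordinary binary heaps (Python's heapq), merged smaller-into-larger, are a substitution of ours for a mergeable heap structure and cost extra logarithmic factors. The names tree_glover_matching, succ, beg and end are hypothetical; vertices are numbered from 0; None plays the role of the null successor of the root; and END[b] is assumed to lie on the path from BEG[b] to the root. "Minimal END" is implemented as "END vertex of greatest depth".

```python
import heapq

def tree_glover_matching(succ, beg, end):
    """succ[a]: successor of a (None for the root); beg[b], end[b]: endpoints
    of the path A(b). Returns a dict mapping matched vertices of A to vertices of B."""
    m = len(succ)
    children = [[] for _ in range(m)]
    root = None
    for a, p in enumerate(succ):
        if p is None:
            root = a
        else:
            children[p].append(a)

    # Breadth-first search from the root; reversing it processes vertices
    # in order of nonincreasing distance to the root.
    depth = [0] * m
    order = [root]
    for a in order:              # the list grows while we iterate over it
        for c in children[a]:
            depth[c] = depth[a] + 1
            order.append(c)

    starts = [[] for _ in range(m)]
    for b, a in enumerate(beg):
        starts[a].append(b)

    heaps = [None] * m
    match = {}
    for a in reversed(order):
        heap = []
        for c in children[a]:    # merge the queues of all predecessors of a
            h, heaps[c] = heaps[c], None
            if len(h) > len(heap):
                heap, h = h, heap
            for item in h:
                heapq.heappush(heap, item)
        for b in starts[a]:      # paths that begin at a
            heapq.heappush(heap, (-depth[end[b]], b))
        if heap:                 # Glover's rule: nearest END first
            _, b = heapq.heappop(heap)
            match[a] = b
        while heap and heap[0][0] == -depth[a]:
            heapq.heappop(heap)  # paths ending at a cannot be matched higher up
        heaps[a] = heap
    return match
```

When the tree degenerates into a single path, e.g. tree_glover_matching([1, 2, None], [0, 0, 0], [0, 1, 2]), the sketch reduces to the convex case and matches every vertex of A.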

If we apply the mergeable heap structure described by van Emde Boas [3],
which allows the priority queues to be merged efficiently, then we can
achieve O(m + A(n) n log log n) time complexity, where A(n) is the very
slowly growing functional inverse of a function of Ackermann type (see
Tarjan [11]).

Our algorithm can be used to find a maximum set of nonattacking rooks
on a chessboard satisfying the following condition: any two squares
$\langle x,y_1\rangle$, $\langle x,y_2\rangle$ can be joined by a sequence
$\langle x,y_1\rangle = \langle x^{(1)},y^{(1)}\rangle, \langle x^{(2)},y^{(2)}\rangle, \ldots, \langle x^{(k)},y^{(k)}\rangle = \langle x,y_2\rangle$
of adjacent (i.e., having an edge in common) squares with $x^{(i)} \ge x$,
$1 \le i \le k$. In words, the chessboard does not branch as we go from left to
right (see Figure 6). The tree-like ordering of the set A of columns of
such a chessboard is defined so that the column containing square $\langle x+1,y\rangle$
is the successor of the column containing square $\langle x,y\rangle$.

Suppose that there is a weight w(b) ≥ 0 associated with every b ∈ B,
and that we are looking for a matching which maximizes the sum of the weights of
the matched vertices in B. Since the assignable subsets of B - i.e., subsets that
can be covered by a matching - form a matroid, it follows that the matching
we are looking for can be found by a matroid greedy algorithm (see Lawler [8]
for  the  explanation  of  all  notions  related  to  matroids).  More  exactly, 
our  matching  can  be  obtained  as  follows:  (i)  order  the  vertices  in  B 
according  to  nonincreasing  weight,  (ii)  starting  with  the  empty  matching, 
scan B in this order; for any b ∈ B, augment the current matching along an
alternating  path  starting  at  b and  ending  at  an  unmatched  vertex  in  A,  if 
such  a path  exists,  or  leave  b unmatched  otherwise.  Notice  that  after  the 
augmentation  process  in  step  (ii),  vertices  which  were  matched  remain  matched 
(possibly to different vertices), and vertices which were left unmatched
before remain unmatched. It can be proved (Gale [5], see also [8]) that
the matching M so obtained is Gale-optimal, i.e., optimal in the following
strong sense: Let $\{b_1,\ldots,b_k\} \subseteq B$, $w(b_1) \ge \ldots \ge w(b_k)$, be the set of
vertices covered by M. Then for any other matching M', the set $\{c_1,\ldots,c_l\} \subseteq B$,
$w(c_1) \ge \ldots \ge w(c_l)$, of vertices covered by M' satisfies the condition
$l \le k$, $w(b_1) \ge w(c_1), \ldots, w(b_l) \ge w(c_l)$. (Notice that both the greedy
algorithm  and  the  notion  of  Gale-optimality  depend  only  on  the  ordering  of 
B according  to  the  weights,  and  not  on  the  actual  values  of  the  weights.) 
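
Steps (i) and (ii) can be illustrated by the following Python sketch for a convex bipartite graph given by the arrays BEG and END. It is a sketch under assumptions of ours, not the faster procedure discussed in the text: augmenting paths are found by a plain recursive depth-first search that scans each interval element by element, the intervals are taken to be inclusive ranges of integers, and the names gale_optimal_matching, beg, end and weight are hypothetical.

```python
def gale_optimal_matching(beg, end, weight):
    """Greedy matroid algorithm: scan B by nonincreasing weight and augment.
    Returns a dict mapping each matched b to the vertex a it is matched to."""
    match_a = {}   # vertex of A -> vertex of B
    match_b = {}   # vertex of B -> vertex of A

    def augment(b, seen):
        # Search for an alternating path from b to an unmatched vertex of A.
        for a in range(beg[b], end[b] + 1):
            if a in seen:
                continue
            seen.add(a)
            if a not in match_a or augment(match_a[a], seen):
                match_a[a] = b
                match_b[b] = a
                return True
        return False

    # (i) order B by nonincreasing weight; (ii) augment from each b in turn.
    for b in sorted(range(len(beg)), key=lambda j: -weight[j]):
        augment(b, set())   # b stays unmatched if no augmenting path exists
    return match_b
```

For instance, with beg = [1, 1], end = [2, 1] and weight = [3, 2], vertex 0 of B is first matched to 1 and is then moved to 2 when vertex 1 of B is processed, which shows matched vertices remaining matched, possibly to different vertices.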

It  is  obvious  that  a Gale-optimal  matching  of  a convex  bipartite  graph 
can be obtained in O(n(m+n)) time by the greedy algorithm, using a modification
of  Algorithm  4,  as  explained  at  the  beginning  of  this  subsection. 

There is an interesting relationship between Gale-optimal matchings
in convex bipartite graphs and the problem of scheduling a set B of n
independent (no precedence constraints) tasks on one processor, where each
task takes one unit of processing time, there is a starting time BEG[j]
and a deadline END[j] for every task j, and a penalty p(j) which must be
paid if this task is not executed in the time interval [BEG[j],END[j]]
(we assume that time is integer-valued). It is easy to see that any
schedule minimizing the total penalty corresponds to a Gale-optimal matching
in the convex bipartite graph defined by the arrays BEG, END, with
$w(j) = M - p(j)$, where $M > \max_{1 \le j \le n} p(j)$: the vertex i matched to task j ∈ B
determines the unit interval of time when j is to be executed (see Lawler [8],
Chapter 7). We conclude that an optimal schedule for this problem can be obtained in
O(n(m+n)) time (m is the maximal deadline). Of course, if all penalties
are equal, i.e., when we simply maximize the number of tasks executed,
then the optimal schedule can be obtained in O(m + n log log n) time by Algorithm 1.
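
For the equal-penalty case, the following Python sketch applies Glover's rule (always execute the available task with the earliest deadline). It is an illustration under assumptions of ours: a binary heap from the standard heapq module replaces the priority queue of the text, so the running time is O(m + n log n) rather than the bound quoted above; time slots are numbered 1, ..., m; BEG[j] ≥ 1 is assumed; and the name schedule_unit_tasks is hypothetical.

```python
import heapq

def schedule_unit_tasks(beg, end):
    """Return a dict {time slot: task} executing as many unit tasks as possible,
    task j being allowed only in slots beg[j], ..., end[j]."""
    if not beg:
        return {}
    m = max(end)                      # the maximal deadline
    starts = {}
    for j, t in enumerate(beg):
        starts.setdefault(t, []).append(j)

    heap = []                         # entries (deadline, task), earliest deadline on top
    schedule = {}
    for t in range(1, m + 1):
        for j in starts.get(t, ()):   # tasks that become available at time t
            heapq.heappush(heap, (end[j], j))
        while heap and heap[0][0] < t:
            heapq.heappop(heap)       # deadline already passed; task is dropped
        if heap:
            _, j = heapq.heappop(heap)
            schedule[t] = j
    return schedule
```

With beg = [1, 1, 3] and end = [3, 1, 3] all three tasks are scheduled: the task with deadline 1 goes first, and the task with deadline 3 released at time 1 fills slot 2.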

As a closing remark, we note that the maximum matching problem on a general
bipartite graph G corresponds to the situation where for any b ∈ B the set
A(b) is a collection of t(b) intervals of A. It is an almost straightforward
extension of our discussions in Sections 2 and 4 to show that the standard
approach based on augmenting paths [8] can be implemented - both for the
maximum matching and for the Gale-optimal matching - in time
$O(n(m + t \log\log n))$, where $t = \sum_{b \in B} t(b)$ is the total number of
intervals in the given G.

Appendix

Algorithm A (Partitioning a sequence of n integers into two nonincreasing
subsequences)

Input: S[1:ℓ] - the original sequence

Output: SUB1[1:ℓ1], SUB2[1:ℓ2] - two nonincreasing subsequences
into which S[1:ℓ] is partitioned (S[1] = SUB1[1])

 1  begin ℓ1 := ℓ2 := 0, SUB1[0] := SUB2[0] := ∞
 2    for i := 1 to ℓ do
 3      if S[i] ≤ SUB1[ℓ1] then (* add S[i] to first subsequence *)
 4        begin ℓ1 := ℓ1 + 1
 5          SUB1[ℓ1] := S[i]
 6        end
 7      else if S[i] ≤ SUB2[ℓ2] then (* add S[i] to second subsequence *)
 8        begin ℓ2 := ℓ2 + 1
 9          SUB2[ℓ2] := S[i]
10        end
11      else stop (* no partitioning possible *)
12  end


To prove the correctness of the algorithm, first notice that we always
have SUB1[ℓ1] ≤ SUB2[ℓ2], the inequality being strict except for ℓ1 = ℓ2 = 0.
If now, for some i, we reach the condition SUB1[ℓ1] < SUB2[ℓ2] < S[i]
(line 11), it is clear that the original S[1:ℓ] contains an increasing
subsequence of length 3, which makes its partitioning into two
nonincreasing subsequences impossible.
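
For readers who prefer executable code, Algorithm A can be transcribed into Python roughly as follows. The transcription is ours: the function name partition_two_nonincreasing is hypothetical, Python lists replace the arrays SUB1 and SUB2, math.inf plays the role of the sentinel ∞, and the "stop" of line 11 is rendered as returning None.

```python
import math

def partition_two_nonincreasing(s):
    """Split s into two nonincreasing subsequences, or return None
    if no such partition exists."""
    sub1, sub2 = [], []
    for x in s:
        last1 = sub1[-1] if sub1 else math.inf   # sentinel SUB1[0] = infinity
        last2 = sub2[-1] if sub2 else math.inf   # sentinel SUB2[0] = infinity
        if x <= last1:
            sub1.append(x)      # add x to the first subsequence
        elif x <= last2:
            sub2.append(x)      # add x to the second subsequence
        else:
            return None         # last1 < last2 < x: no partitioning possible
    return sub1, sub2
```

For example, partition_two_nonincreasing([5, 3, 6, 4, 2]) returns ([5, 3, 2], [6, 4]), while the increasing sequence [1, 2, 3] yields None.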

One may note that the algorithm easily generalizes to an algorithm for
partitioning an arbitrary sequence of length ℓ into the minimal possible
number of nonincreasing subsequences, in time O(ℓ log d), where d is this
minimal number of subsequences, or - equivalently - the maximal length
of an increasing subsequence in the given sequence.
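
One possible way to realize this generalization is sketched below in Python; it is our reading of the remark, not a procedure given in the text. Each element is appended to the first subsequence whose last element is not smaller than it. Since the last elements of the subsequences form a strictly increasing list, that subsequence can be located by binary search (the standard bisect module), giving O(log d) per element. The name partition_min_nonincreasing is hypothetical.

```python
from bisect import bisect_left

def partition_min_nonincreasing(s):
    """Partition s into the minimal number of nonincreasing subsequences."""
    tails = []   # tails[k] = last element of subsequence k; strictly increasing
    piles = []   # the subsequences themselves
    for x in s:
        k = bisect_left(tails, x)    # first subsequence whose tail is >= x
        if k == len(tails):
            tails.append(x)          # x exceeds every tail: open a new subsequence
            piles.append([x])
        else:
            tails[k] = x             # x <= old tail, so subsequence k stays nonincreasing
            piles[k].append(x)
    return piles
```

On the sequence [5, 3, 6, 4, 2, 6], which Algorithm A rejects, the sketch produces the three subsequences [5, 3, 2], [6, 4] and [6], matching the increasing subsequence 3, 4, 6 of length 3.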